Basecall summary

Note: In computational biology, N50 is a statistic computed from a set of contig or scaffold lengths: it is the length such that contigs of that length or longer contain at least half of the total bases. The N50 is similar to a mean or median of lengths, but gives greater weight to the longer contigs. It is widely used in genome assembly, especially to describe contig lengths within a draft assembly.
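Example: As a minimal sketch of the N50 definition, the snippet below sorts the lengths from longest to shortest and returns the length at which the running total first reaches half of the overall sum. The function name `n50` and the sample lengths are illustrative assumptions, not part of the report's own code.

```python
def n50(lengths):
    """Return the N50 of a collection of lengths (hypothetical helper)."""
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Made-up lengths: total is 54, half is 27, and 10 + 9 + 8 reaches 27.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))  # 8
```

For these sample lengths the mean and median are both 6, while the N50 is 8, which shows how the statistic leans toward the longer sequences.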

Basecalled reads length

Basecalled reads length for all barcodes

Blue line: Median

Red line: Mean

Explanation: The basecalled reads length plot is a distribution plot (distplot) that shows read density on the y-axis against basecalled read length, on a logarithmic scale, on the x-axis for all barcodes in the FASTQ file.

Note:
- A distplot (distribution plot) depicts how the values in the data are distributed.
- A logarithmic scale (or log scale) displays numerical data that spans a very wide range of values in a compact form.

Basecalled reads length for each barcode

Explanation: The basecalled reads length plot for each barcode is a histogram (histplot) that shows the number of reads on the y-axis against basecalled read length, on a logarithmic scale, on the x-axis for each barcode in the FASTQ file.

Note:
- A histplot (histogram plot) visualizes the distribution of numerical data by counting how many values fall into each bin.
- A logarithmic scale (or log scale) displays numerical data that spans a very wide range of values in a compact form.
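Example: To illustrate how a log scale compacts a wide range of read lengths, the sketch below counts reads per power of ten, which is the idea behind a histogram with a logarithmic x-axis. The function name `reads_per_decade`, the one-bin-per-decade width, and the sample lengths are assumptions for illustration; the report's actual binning is not stated here.

```python
import math
from collections import Counter


def reads_per_decade(read_lengths):
    """Count reads in each power-of-ten length bin (hypothetical helper).

    Key k covers lengths from 10**k up to, but not including, 10**(k + 1).
    """
    counts = Counter(math.floor(math.log10(length)) for length in read_lengths)
    return dict(sorted(counts.items()))


# Made-up read lengths spanning from hundreds to over a hundred thousand bases.
sample_lengths = [150, 900, 1200, 8000, 25000, 110000]
print(reads_per_decade(sample_lengths))  # {2: 2, 3: 2, 4: 1, 5: 1}
```

On a linear axis the 110000-base read would squeeze all the others into the first sliver of the plot; on the log scale each decade gets the same width.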

Quality score summary

Explanation: The quality score summary table shows descriptive statistics for each barcode arrangement.

Basecalled reads PHRED quality

Red line: Cut-off line suggestion (Mean quality score at 8.0)

Explanation: The basecalled reads PHRED quality distplot shows the distribution of the mean quality score across all reads.
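Example: A PHRED quality score Q relates to the estimated base-calling error probability P by Q = -10 * log10(P). The sketch below converts between the two and shows one common way to average quality over a read, by averaging error probabilities rather than the scores themselves. The function names are hypothetical, and whether this report computes its mean quality score this way is an assumption, not something the report states.

```python
import math


def phred_to_error_probability(q):
    """Convert a PHRED score to an error probability: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability to a PHRED score: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def mean_quality(per_base_scores):
    """Mean read quality via averaged error probabilities (assumed convention)."""
    if not per_base_scores:
        raise ValueError("per_base_scores must not be empty")
    probabilities = [phred_to_error_probability(q) for q in per_base_scores]
    return error_probability_to_phred(sum(probabilities) / len(probabilities))


# At the suggested cut-off of 8.0 the error probability is roughly 0.16.
print(round(phred_to_error_probability(8.0), 3))  # 0.158
```

Averaging probabilities gives a lower value than a plain arithmetic mean of the scores whenever the scores differ, because low-quality bases dominate the error rate.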

Red line: Cut-off line suggestion (Mean quality score at 8.0)

Explanation: The basecalled reads PHRED quality histplot shows the frequency distribution of the mean quality score for each barcode arrangement.

Number of reads per quality score

Explanation: The number of reads per quality score plot shows the proportion of passed reads to failed reads.
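Example: The pass/fail split can be sketched as a simple threshold on each read's mean quality score. The snippet assumes that a read passes when its mean quality is at or above the suggested cut-off of 8.0; the function name `pass_fail_counts`, the inclusive comparison, and the sample scores are assumptions for illustration.

```python
def pass_fail_counts(mean_qualities, cutoff=8.0):
    """Count reads at or above the cut-off (pass) and below it (fail)."""
    passed = sum(1 for quality in mean_qualities if quality >= cutoff)
    return {"pass": passed, "fail": len(mean_qualities) - passed}


# Made-up mean quality scores for five reads.
sample_qualities = [6.5, 7.9, 8.0, 9.2, 12.1]
print(pass_fail_counts(sample_qualities))  # {'pass': 3, 'fail': 2}
```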

Read Length vs Quality Score Summary

This last section illustrates the relationship between read length and quality score. In other words, you can conveniently see how many good reads there are at each length.

Scatter plot

Explanation: Each dot represents one read in the FASTQ file. The x-axis is the log10-transformed read length, and the y-axis is the quality score. Reads with different barcodes are plotted separately. The scatter plot provides a simple, explicit view of how length and quality are distributed across the dataset. Each dot is semi-transparent, so you can see where the dots overlap and cluster.

Heat Map

Explanation: The heatmaps share the same x and y axes as the scatter plot. These colorful, interactive plots show how many reads fall in each range of read length and quality score. Each heatmap draws a grid and counts the reads in each cell. To interpret it, yellow represents a high frequency (many reads), while purple represents a low percentage. The percentage for each shade is shown in the color bar.
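Example: The grid counting behind a heatmap can be sketched as a two-dimensional binning of (log10 read length, quality score) pairs, followed by a conversion of each cell count to a percentage of all reads. The function name `grid_percentages`, the cell sizes, and the sample reads are assumptions for illustration; the report's actual grid resolution is not stated here.

```python
import math
from collections import Counter


def grid_percentages(reads, length_step=0.5, quality_step=2.0):
    """Bin (length, quality) pairs into grid cells and return percentages.

    Cells are indexed by (column, row): the column comes from log10(length)
    in steps of length_step, the row from quality in steps of quality_step.
    """
    counts = Counter()
    for length, quality in reads:
        column = math.floor(math.log10(length) / length_step)
        row = math.floor(quality / quality_step)
        counts[(column, row)] += 1
    total = sum(counts.values())
    return {cell: 100 * n / total for cell, n in counts.items()}


# Made-up reads as (length in bases, mean quality score).
sample_reads = [(1200, 9.1), (1500, 9.8), (30000, 12.4), (800, 6.2)]
for cell, percent in sorted(grid_percentages(sample_reads).items()):
    print(cell, percent)
```

In this sample the two reads near 1 kb with quality around 9 land in the same cell, so that cell holds 50% of the reads and would be drawn in the brightest shade.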